Overview

Dataset statistics

Number of variables: 24
Number of observations: 30000
Missing cells: 25062
Missing cells (%): 3.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.5 MiB
Average record size in memory: 192.0 B
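
The percentage above follows directly from the raw counts. As a quick check, here is a minimal sketch in plain Python; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 24
n_observations = 30000
missing_cells = 25062

# "Missing cells (%)" is the share of empty cells among all cells
# (rows multiplied by columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Rounded to one decimal place, the result agrees with the 3.5% reported.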

Variable types

Categorical: 12
Numeric: 10
Boolean: 2

Warnings

Customer ID has a high cardinality: 30000 distinct values High cardinality
Name has a high cardinality: 30000 distinct values High cardinality
Income (USD) is highly correlated with Property Age High correlation
Loan Amount Request (USD) is highly correlated with Current Loan Expenses (USD) and 2 other fields High correlation
Current Loan Expenses (USD) is highly correlated with Loan Amount Request (USD) and 1 other field High correlation
Property Age is highly correlated with Income (USD) High correlation
Property Price is highly correlated with Loan Amount Request (USD) and 2 other fields High correlation
Loan Sanction Amount (USD) is highly correlated with Loan Amount Request (USD) and 1 other field High correlation
Property Price is highly correlated with Loan Amount Request (USD) and 1 other field High correlation
Loan Sanction Amount (USD) is highly correlated with Loan Amount Request (USD) High correlation
Income Stability is highly correlated with Age and 1 other field High correlation
Current Loan Expenses (USD) is highly correlated with Loan Sanction Amount (USD) and 2 other fields High correlation
Loan Sanction Amount (USD) is highly correlated with Current Loan Expenses (USD) and 2 other fields High correlation
Property Price is highly correlated with Current Loan Expenses (USD) and 2 other fields High correlation
Age is highly correlated with Income Stability High correlation
Profession is highly correlated with Income Stability High correlation
Income Stability is highly correlated with Type of Employment and 1 other field High correlation
Type of Employment is highly correlated with Income Stability High correlation
Income (USD) has 4576 (15.3%) missing values Missing
Income Stability has 1683 (5.6%) missing values Missing
Type of Employment has 7270 (24.2%) missing values Missing
Dependents has 2493 (8.3%) missing values Missing
Credit Score has 1703 (5.7%) missing values Missing
Has Active Credit Card has 1566 (5.2%) missing values Missing
Property Age has 4850 (16.2%) missing values Missing
Property Location has 356 (1.2%) missing values Missing
Loan Sanction Amount (USD) has 340 (1.1%) missing values Missing
Income (USD) is highly skewed (γ1 = 154.00172) Skewed
Property Age is highly skewed (γ1 = 153.2196101) Skewed
Customer ID is uniformly distributed Uniform
Name is uniformly distributed Uniform
Customer ID has unique values Unique
Name has unique values Unique
Loan Sanction Amount (USD) has 7865 (26.2%) zeros Zeros
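
Several of the warning labels above (Missing, Unique, High cardinality, Zeros) can be derived from simple counts over a column. The sketch below shows one way to do that in plain Python. It is not the profiler's own logic: the function name and the cardinality threshold are assumptions made for illustration, and the report's real settings live in config.json, which is not shown here.

```python
from collections import Counter

# Hypothetical threshold, chosen for illustration only.
HIGH_CARDINALITY_THRESHOLD = 50


def column_warnings(values):
    """Return warning labels for one column given as a list (None = missing)."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    counts = Counter(present)
    labels = []
    if n_missing > 0:
        labels.append("Missing")
    # Every non-missing value occurs exactly once.
    if present and len(counts) == len(present):
        labels.append("Unique")
    if len(counts) > HIGH_CARDINALITY_THRESHOLD:
        labels.append("High cardinality")
    if counts.get(0, 0) > 0:
        labels.append("Zeros")
    return labels


# Toy columns shaped like Customer ID and Loan Sanction Amount (USD).
customer_ids = [f"C-{i}" for i in range(1, 101)]
sanction = [0, 0, 1500.0, None, 2300.5]

print(column_warnings(customer_ids))
print(column_warnings(sanction))
```

The toy identifier column is flagged as unique and high-cardinality, and the toy amount column as having missing values and zeros, which matches the pattern of warnings listed for the corresponding real columns.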

Reproduction

Analysis started: 2021-06-27 06:18:31.349608
Analysis finished: 2021-06-27 06:19:32.772486
Duration: 1 minute and 1.42 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
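
The reported duration is the difference between the two timestamps. A minimal sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-06-27 06:18:31.349608", FMT)
finished = datetime.strptime("2021-06-27 06:19:32.772486", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} second(s)")
```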

Variables

Customer ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 30000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

C-14537: 1
C-27652: 1
C-3942: 1
C-5210: 1
C-21819: 1
Other values (29995): 29995

Length

Max length: 7
Median length: 7
Mean length: 6.7817
Min length: 3

Characters and Unicode

Total characters: 203451
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
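
To illustrate the character properties mentioned above, the sketch below counts the Unicode general categories in one Customer ID using Python's standard `unicodedata` module. The helper name is illustrative, and this is not necessarily how the profiler computes its figures. Script and block information is not exposed by `unicodedata`, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Lu = Uppercase Letter, Pd = Dash Punctuation, Nd = Decimal Number
counts = category_counts("C-14537")
print(counts)
```

A value such as C-14537 therefore contributes to exactly three categories, consistent with the three distinct categories reported for this variable.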

Unique

Unique: 30000
Unique (%): 100.0%

Sample

1st row: C-36995
2nd row: C-33999
3rd row: C-3770
4th row: C-26480
5th row: C-23459

Common Values

Value                 Count  Frequency (%)
C-14537               1      < 0.1%
C-27652               1      < 0.1%
C-3942                1      < 0.1%
C-5210                1      < 0.1%
C-21819               1      < 0.1%
C-37518               1      < 0.1%
C-10686               1      < 0.1%
C-16899               1      < 0.1%
C-47213               1      < 0.1%
C-11941               1      < 0.1%
Other values (29990)  29990  > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
c-233261
 
< 0.1%
c-223061
 
< 0.1%
c-492521
 
< 0.1%
c-183141
 
< 0.1%
c-317811
 
< 0.1%
c-282211
 
< 0.1%
c-401061
 
< 0.1%
c-476171
 
< 0.1%
c-470621
 
< 0.1%
c-52341
 
< 0.1%
Other values (29990)29990
> 99.9%

Most occurring characters

ValueCountFrequency (%)
C30000
14.7%
-30000
14.7%
318132
8.9%
118078
8.9%
218076
8.9%
418055
8.9%
712108
6.0%
811965
 
5.9%
911955
 
5.9%
511888
 
5.8%
Other values (2)23194
11.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number143451
70.5%
Uppercase Letter30000
 
14.7%
Dash Punctuation30000
 
14.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
318132
12.6%
118078
12.6%
218076
12.6%
418055
12.6%
712108
8.4%
811965
8.3%
911955
8.3%
511888
8.3%
611851
8.3%
011343
7.9%
Uppercase Letter
ValueCountFrequency (%)
C30000
100.0%
Dash Punctuation
ValueCountFrequency (%)
-30000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common173451
85.3%
Latin30000
 
14.7%

Most frequent character per script

Common
ValueCountFrequency (%)
-30000
17.3%
318132
10.5%
118078
10.4%
218076
10.4%
418055
10.4%
712108
7.0%
811965
 
6.9%
911955
 
6.9%
511888
 
6.9%
611851
 
6.8%
Latin
ValueCountFrequency (%)
C30000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII203451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C30000
14.7%
-30000
14.7%
318132
8.9%
118078
8.9%
218076
8.9%
418055
8.9%
712108
6.0%
811965
 
5.9%
911955
 
5.9%
511888
 
5.8%
Other values (2)23194
11.4%

Name
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 30000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

Armida Lajeunesse: 1
Lacey Devenport: 1
Pamula Sackett: 1
Maricela Brwon: 1
Darlena Omalley: 1
Other values (29995): 29995

Length

Max length: 24
Median length: 13
Mean length: 13.52893333
Min length: 6

Characters and Unicode

Total characters: 405868
Distinct characters: 53
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 30000
Unique (%): 100.0%

Sample

1st row: Frederica Shealy
2nd row: America Calderone
3rd row: Rosetta Verne
4th row: Zoe Chitty
5th row: Afton Venema

Common Values

Value                 Count  Frequency (%)
Armida Lajeunesse     1      < 0.1%
Lacey Devenport       1      < 0.1%
Pamula Sackett        1      < 0.1%
Maricela Brwon        1      < 0.1%
Darlena Omalley       1      < 0.1%
Genie Antoine         1      < 0.1%
Glendora Calender     1      < 0.1%
Margareta Koffler     1      < 0.1%
Roxy Teamer           1      < 0.1%
Nakita Marro          1      < 0.1%
Other values (29990)  29990  > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
gina113
 
0.2%
sidney106
 
0.2%
teri102
 
0.2%
lorriane102
 
0.2%
jenni100
 
0.2%
noe100
 
0.2%
selena99
 
0.2%
margareta98
 
0.2%
desire98
 
0.2%
brian97
 
0.2%
Other values (2493)58985
98.3%

Most occurring characters

ValueCountFrequency (%)
e44455
 
11.0%
a40934
 
10.1%
30000
 
7.4%
n29296
 
7.2%
r27736
 
6.8%
i26855
 
6.6%
l24035
 
5.9%
o20028
 
4.9%
t14824
 
3.7%
s13009
 
3.2%
Other values (43)134696
33.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter315868
77.8%
Uppercase Letter60000
 
14.8%
Space Separator30000
 
7.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S5846
 
9.7%
M5481
 
9.1%
L4541
 
7.6%
B4139
 
6.9%
A3989
 
6.6%
C3862
 
6.4%
D3206
 
5.3%
K3163
 
5.3%
G2844
 
4.7%
R2721
 
4.5%
Other values (16)20208
33.7%
Lowercase Letter
ValueCountFrequency (%)
e44455
14.1%
a40934
13.0%
n29296
9.3%
r27736
8.8%
i26855
8.5%
l24035
 
7.6%
o20028
 
6.3%
t14824
 
4.7%
s13009
 
4.1%
d9474
 
3.0%
Other values (16)65222
20.6%
Space Separator
ValueCountFrequency (%)
30000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin375868
92.6%
Common30000
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e44455
 
11.8%
a40934
 
10.9%
n29296
 
7.8%
r27736
 
7.4%
i26855
 
7.1%
l24035
 
6.4%
o20028
 
5.3%
t14824
 
3.9%
s13009
 
3.5%
d9474
 
2.5%
Other values (42)125222
33.3%
Common
ValueCountFrequency (%)
30000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII405868
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e44455
 
11.0%
a40934
 
10.1%
30000
 
7.4%
n29296
 
7.2%
r27736
 
6.8%
i26855
 
6.6%
l24035
 
5.9%
o20028
 
4.9%
t14824
 
3.7%
s13009
 
3.2%
Other values (43)134696
33.2%

Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 53
Missing (%): 0.2%
Memory size: 234.5 KiB

M: 15053
F: 14894

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 29947
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: F
4th row: F
5th row: F

Common Values

Value      Count  Frequency (%)
M          15053  50.2%
F          14894  49.6%
(Missing)  53     0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
m15053
50.3%
f14894
49.7%

Most occurring characters

ValueCountFrequency (%)
M15053
50.3%
F14894
49.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter29947
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M15053
50.3%
F14894
49.7%

Most occurring scripts

ValueCountFrequency (%)
Latin29947
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M15053
50.3%
F14894
49.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII29947
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M15053
50.3%
F14894
49.7%

Age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 48
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.0923
Minimum: 18
Maximum: 65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 18
5-th percentile: 18
Q1: 25
median: 40
Q3: 55
95-th percentile: 64
Maximum: 65
Range: 47
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 16.04512893
Coefficient of variation (CV): 0.4002047507
Kurtosis: -1.382082938
Mean: 40.0923
Median Absolute Deviation (MAD): 15
Skewness: 0.0460938016
Sum: 1202769
Variance: 257.4461622
Monotonicity: Not monotonic
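
Several of the statistics above are derived from others in the same list. The sketch below recomputes them for Age from the reported values. It is a plain-Python check with illustrative variable names, not the profiler's code.

```python
import math

# Reported values for Age.
minimum, q1, q3, maximum = 18, 25, 55, 65
mean = 40.0923
std = 16.04512893

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation

print(value_range, iqr)
print(f"CV = {cv:.10f}, variance = {variance:.7f}")
```

The recomputed values agree with the report up to rounding of the printed inputs.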
Histogram with fixed size bins (bins=48)
Value              Count  Frequency (%)
18                 4378   14.6%
65                 1349   4.5%
64                 813    2.7%
61                 787    2.6%
62                 784    2.6%
60                 774    2.6%
63                 712    2.4%
57                 538    1.8%
47                 536    1.8%
44                 523    1.7%
Other values (38)  18806  62.7%

Value  Count  Frequency (%)
18     4378   14.6%
19     482    1.6%
20     499    1.7%
21     510    1.7%
22     469    1.6%
23     520    1.7%
24     503    1.7%
25     470    1.6%
26     513    1.7%
27     511    1.7%

Value  Count  Frequency (%)
65     1349   4.5%
64     813    2.7%
63     712    2.4%
62     784    2.6%
61     787    2.6%
60     774    2.6%
59     492    1.6%
58     514    1.7%
57     538    1.8%
56     507    1.7%

Income (USD)
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 24429
Distinct (%): 96.1%
Missing: 4576
Missing (%): 15.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2630.574417
Minimum: 377.7
Maximum: 1777460.21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 377.7
5-th percentile: 1034.6115
Q1: 1650.4575
median: 2222.435
Q3: 3090.5925
95-th percentile: 5071.825
Maximum: 1777460.21
Range: 1777082.51
Interquartile range (IQR): 1440.135

Descriptive statistics

Standard deviation: 11262.72383
Coefficient of variation (CV): 4.281469384
Kurtosis: 24259.66896
Mean: 2630.574417
Median Absolute Deviation (MAD): 663.2
Skewness: 154.00172
Sum: 66879723.99
Variance: 126848948.1
Monotonicity: Not monotonic
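
The very large skewness sits alongside a maximum (1777460.21) far above the 95-th percentile (5071.825), which suggests a few extreme values dominate the statistic. The sketch below illustrates that effect on made-up numbers. It uses the population (Fisher-Pearson) form of skewness, m3 / m2^1.5; libraries often apply a bias correction, so their figures can differ slightly. The sample values are hypothetical apart from the reported maximum.

```python
def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical incomes in the typical range of the report.
typical = [1650.0, 1800.0, 2000.0, 2222.0, 2400.0, 2600.0, 2800.0, 3090.0, 3300.0]
# The same sample plus the single reported maximum.
with_outlier = typical + [1777460.21]

print(f"typical only: {skewness(typical):.3f}")
print(f"with outlier: {skewness(with_outlier):.3f}")
```

One extreme value is enough to push the skewness of an otherwise fairly symmetric sample well above zero, so the median and IQR are likely more informative summaries for this variable.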
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
2415.7                4      < 0.1%
1608.43               4      < 0.1%
1723.73               3      < 0.1%
2206.6                3      < 0.1%
1868.78               3      < 0.1%
2304.81               3      < 0.1%
1617.97               3      < 0.1%
1282.74               3      < 0.1%
1681.48               3      < 0.1%
2412.07               3      < 0.1%
Other values (24419)  25392  84.6%
(Missing)             4576   15.3%

Value   Count  Frequency (%)
377.7   1      < 0.1%
378.76  1      < 0.1%
378.91  1      < 0.1%
393.09  1      < 0.1%
418.9   1      < 0.1%
424.45  1      < 0.1%
437.63  1      < 0.1%
438.44  1      < 0.1%
442.47  1      < 0.1%
450.16  1      < 0.1%

Value       Count  Frequency (%)
1777460.21  1      < 0.1%
122966.28   1      < 0.1%
54653.75    1      < 0.1%
48095.16    1      < 0.1%
32726.98    1      < 0.1%
31866.97    1      < 0.1%
31584.05    1      < 0.1%
28331.74    1      < 0.1%
27518.64    1      < 0.1%
26049.85    1      < 0.1%

Income Stability
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 1683
Missing (%): 5.6%
Memory size: 234.5 KiB

Low: 25751
High: 2566

Length

Max length: 4
Median length: 3
Mean length: 3.090616944
Min length: 3

Characters and Unicode

Total characters: 87517
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Low
2nd row: Low
3rd row: High
4th row: High
5th row: Low

Common Values

Value      Count  Frequency (%)
Low        25751  85.8%
High       2566   8.6%
(Missing)  1683   5.6%
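
The percentages in this table are shares of all 30000 rows, missing values included. The word-frequency table further down in this variable's section gives 90.9% and 9.1% for the same two counts because it divides by the non-missing rows only. A minimal sketch of both calculations, with illustrative names:

```python
n_rows = 30000
counts = {"Low": 25751, "High": 2566}
n_missing = 1683

# Rows that actually carry a value.
n_present = sum(counts.values())

# Share of all rows (missing included) versus share of non-missing rows.
share_of_all = {k: round(100 * v / n_rows, 1) for k, v in counts.items()}
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

print(share_of_all)
print(share_of_present)
```

When comparing categories across variables with different amounts of missing data, it helps to state which denominator is in use.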

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
low25751
90.9%
high2566
 
9.1%

Most occurring characters

ValueCountFrequency (%)
L25751
29.4%
o25751
29.4%
w25751
29.4%
H2566
 
2.9%
i2566
 
2.9%
g2566
 
2.9%
h2566
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter59200
67.6%
Uppercase Letter28317
32.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o25751
43.5%
w25751
43.5%
i2566
 
4.3%
g2566
 
4.3%
h2566
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
L25751
90.9%
H2566
 
9.1%

Most occurring scripts

ValueCountFrequency (%)
Latin87517
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L25751
29.4%
o25751
29.4%
w25751
29.4%
H2566
 
2.9%
i2566
 
2.9%
g2566
 
2.9%
h2566
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII87517
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L25751
29.4%
o25751
29.4%
w25751
29.4%
H2566
 
2.9%
i2566
 
2.9%
g2566
 
2.9%
h2566
 
2.9%

Profession
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

Working: 16926
Commercial associate: 7962
Pensioner: 2740
State servant: 2366
Unemployed: 2
Other values (3): 4

Length

Max length: 20
Median length: 7
Mean length: 11.1068
Min length: 7

Characters and Unicode

Total characters: 333204
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Working
2nd row: Working
3rd row: Pensioner
4th row: Pensioner
5th row: Working

Common Values

Value                 Count  Frequency (%)
Working               16926  56.4%
Commercial associate  7962   26.5%
Pensioner             2740   9.1%
State servant         2366   7.9%
Unemployed            2      < 0.1%
Businessman           2      < 0.1%
Student               1      < 0.1%
Maternity leave       1      < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
working16926
42.0%
commercial7962
19.7%
associate7962
19.7%
pensioner2740
 
6.8%
servant2366
 
5.9%
state2366
 
5.9%
unemployed2
 
< 0.1%
businessman2
 
< 0.1%
student1
 
< 0.1%
maternity1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
i35593
10.7%
o35592
10.7%
r29995
 
9.0%
a28622
 
8.6%
e26146
 
7.8%
n24780
 
7.4%
s21036
 
6.3%
W16926
 
5.1%
k16926
 
5.1%
g16926
 
5.1%
Other values (16)80662
24.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter292875
87.9%
Uppercase Letter30000
 
9.0%
Space Separator10329
 
3.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i35593
12.2%
o35592
12.2%
r29995
10.2%
a28622
9.8%
e26146
8.9%
n24780
8.5%
s21036
7.2%
k16926
5.8%
g16926
5.8%
m15928
 
5.4%
Other values (8)41331
14.1%
Uppercase Letter
ValueCountFrequency (%)
W16926
56.4%
C7962
26.5%
P2740
 
9.1%
S2367
 
7.9%
U2
 
< 0.1%
B2
 
< 0.1%
M1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
10329
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin322875
96.9%
Common10329
 
3.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i35593
11.0%
o35592
11.0%
r29995
9.3%
a28622
8.9%
e26146
 
8.1%
n24780
 
7.7%
s21036
 
6.5%
W16926
 
5.2%
k16926
 
5.2%
g16926
 
5.2%
Other values (15)70333
21.8%
Common
ValueCountFrequency (%)
10329
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII333204
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i35593
10.7%
o35592
10.7%
r29995
 
9.0%
a28622
 
8.6%
e26146
 
7.8%
n24780
 
7.4%
s21036
 
6.3%
W16926
 
5.1%
k16926
 
5.1%
g16926
 
5.1%
Other values (16)80662
24.2%

Type of Employment
Categorical

HIGH CORRELATION
MISSING

Distinct: 18
Distinct (%): 0.1%
Missing: 7270
Missing (%): 24.2%
Memory size: 234.5 KiB

Laborers: 5578
Sales staff: 3736
Core staff: 3230
Managers: 2495
Drivers: 1606
Other values (13): 6085

Length

Max length: 21
Median length: 10
Mean length: 10.61728993
Min length: 7

Characters and Unicode

Total characters: 241331
Distinct characters: 36
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sales staff
2nd row: High skill tech staff
3rd row: Secretaries
4th row: Laborers
5th row: Managers

Common Values

Value                  Count  Frequency (%)
Laborers               5578   18.6%
Sales staff            3736   12.5%
Core staff             3230   10.8%
Managers               2495   8.3%
Drivers                1606   5.4%
Accountants            1379   4.6%
High skill tech staff  1307   4.4%
Medicine staff         864    2.9%
Security staff         579    1.9%
Cooking staff          566    1.9%
Other values (8)       1390   4.6%
(Missing)              7270   24.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
staff11263
30.3%
laborers5740
15.4%
sales3736
 
10.0%
core3230
 
8.7%
managers2495
 
6.7%
drivers1606
 
4.3%
accountants1379
 
3.7%
skill1307
 
3.5%
tech1307
 
3.5%
high1307
 
3.5%
Other values (13)3827
 
10.3%

Most occurring characters

ValueCountFrequency (%)
s28426
11.8%
a28422
11.8%
e22741
 
9.4%
f22526
 
9.3%
r22300
 
9.2%
t16731
 
6.9%
14467
 
6.0%
o11643
 
4.8%
i8590
 
3.6%
n7600
 
3.1%
Other values (26)57885
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter203512
84.3%
Uppercase Letter23041
 
9.5%
Space Separator14467
 
6.0%
Dash Punctuation162
 
0.1%
Other Punctuation149
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s28426
14.0%
a28422
14.0%
e22741
11.2%
f22526
11.1%
r22300
11.0%
t16731
8.2%
o11643
5.7%
i8590
 
4.2%
n7600
 
3.7%
l7101
 
3.5%
Other values (11)27432
13.5%
Uppercase Letter
ValueCountFrequency (%)
L5902
25.6%
S4476
19.4%
C4137
18.0%
M3359
14.6%
D1606
 
7.0%
H1379
 
6.0%
A1379
 
6.0%
P342
 
1.5%
R158
 
0.7%
W149
 
0.6%
Other values (2)154
 
0.7%
Space Separator
ValueCountFrequency (%)
14467
100.0%
Other Punctuation
ValueCountFrequency (%)
/149
100.0%
Dash Punctuation
ValueCountFrequency (%)
-162
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin226553
93.9%
Common14778
 
6.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s28426
12.5%
a28422
12.5%
e22741
10.0%
f22526
9.9%
r22300
9.8%
t16731
 
7.4%
o11643
 
5.1%
i8590
 
3.8%
n7600
 
3.4%
l7101
 
3.1%
Other values (23)50473
22.3%
Common
ValueCountFrequency (%)
14467
97.9%
-162
 
1.1%
/149
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII241331
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s28426
11.8%
a28422
11.8%
e22741
 
9.4%
f22526
 
9.3%
r22300
 
9.2%
t16731
 
6.9%
14467
 
6.0%
o11643
 
4.8%
i8590
 
3.6%
n7600
 
3.1%
Other values (26)57885
24.0%

Location
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

Semi-Urban: 21563
Rural: 5338
Urban: 3099

Length

Max length: 10
Median length: 10
Mean length: 8.593833333
Min length: 5

Characters and Unicode

Total characters: 257815
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Semi-Urban
2nd row: Semi-Urban
3rd row: Semi-Urban
4th row: Rural
5th row: Semi-Urban

Common Values

Value       Count  Frequency (%)
Semi-Urban  21563  71.9%
Rural       5338   17.8%
Urban       3099   10.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
semi-urban21563
71.9%
rural5338
 
17.8%
urban3099
 
10.3%

Most occurring characters

ValueCountFrequency (%)
r30000
11.6%
a30000
11.6%
U24662
9.6%
b24662
9.6%
n24662
9.6%
S21563
8.4%
e21563
8.4%
m21563
8.4%
i21563
8.4%
-21563
8.4%
Other values (3)16014
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter184689
71.6%
Uppercase Letter51563
 
20.0%
Dash Punctuation21563
 
8.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r30000
16.2%
a30000
16.2%
b24662
13.4%
n24662
13.4%
e21563
11.7%
m21563
11.7%
i21563
11.7%
u5338
 
2.9%
l5338
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
U24662
47.8%
S21563
41.8%
R5338
 
10.4%
Dash Punctuation
ValueCountFrequency (%)
-21563
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin236252
91.6%
Common21563
 
8.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
r30000
12.7%
a30000
12.7%
U24662
10.4%
b24662
10.4%
n24662
10.4%
S21563
9.1%
e21563
9.1%
m21563
9.1%
i21563
9.1%
R5338
 
2.3%
Other values (2)10676
 
4.5%
Common
ValueCountFrequency (%)
-21563
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII257815
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r30000
11.6%
a30000
11.6%
U24662
9.6%
b24662
9.6%
n24662
9.6%
S21563
8.4%
e21563
8.4%
m21563
8.4%
i21563
8.4%
-21563
8.4%
Other values (3)16014
6.2%

Loan Amount Request (USD)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 29982
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 88826.33386
Minimum: 6048.24
Maximum: 621497.82
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 6048.24
5-th percentile: 21009.856
Q1: 41177.755
median: 75128.075
Q3: 119964.605
95-th percentile: 203686.565
Maximum: 621497.82
Range: 615449.58
Interquartile range (IQR): 78786.85

Descriptive statistics

Standard deviation: 59536.9496
Coefficient of variation (CV): 0.6702623763
Kurtosis: 2.103379537
Mean: 88826.33386
Median Absolute Deviation (MAD): 36681.98
Skewness: 1.260392183
Sum: 2664790016
Variance: 3544648368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
70410.47              2      < 0.1%
81346.59              2      < 0.1%
30810.81              2      < 0.1%
20323.29              2      < 0.1%
68353.33              2      < 0.1%
73908.66              2      < 0.1%
115859.9              2      < 0.1%
60315.37              2      < 0.1%
101644.5              2      < 0.1%
43921.71              2      < 0.1%
Other values (29972)  29980  99.9%

Value    Count  Frequency (%)
6048.24  1      < 0.1%
6108.05  1      < 0.1%
6145.01  1      < 0.1%
6174.7   1      < 0.1%
6189.5   1      < 0.1%
6307.15  1      < 0.1%
6310.26  1      < 0.1%
6341.02  1      < 0.1%
6431.37  1      < 0.1%
6436.14  1      < 0.1%

Value      Count  Frequency (%)
621497.82  1      < 0.1%
602384.15  1      < 0.1%
564812.49  1      < 0.1%
447860.56  1      < 0.1%
417936.91  1      < 0.1%
417580.72  1      < 0.1%
412229.23  1      < 0.1%
408413.2   1      < 0.1%
401084.8   1      < 0.1%
398662.89  1      < 0.1%

Current Loan Expenses (USD)
Real number (ℝ)

HIGH CORRELATION

Distinct: 24041
Distinct (%): 80.6%
Missing: 172
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 400.9368764
Minimum: -999
Maximum: 3840.88
Zeros: 0
Zeros (%): 0.0%
Negative: 177
Negative (%): 0.6%
Memory size: 234.5 KiB

Quantile statistics

Minimum: -999
5-th percentile: 127.3635
Q1: 247.6675
median: 375.205
Q3: 521.2925
95-th percentile: 799.7755
Maximum: 3840.88
Range: 4839.88
Interquartile range (IQR): 273.625

Descriptive statistics

Standard deviation: 242.5453749
Coefficient of variation (CV): 0.6049465369
Kurtosis: 10.75107265
Mean: 400.9368764
Median Absolute Deviation (MAD): 134.82
Skewness: 0.04432815526
Sum: 11959145.15
Variance: 58828.25889
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
-999                  177    0.6%
333.79                6      < 0.1%
366.37                6      < 0.1%
185.9                 5      < 0.1%
323.12                5      < 0.1%
430.93                5      < 0.1%
340.39                5      < 0.1%
408.45                4      < 0.1%
233.95                4      < 0.1%
400.66                4      < 0.1%
Other values (24031)  29607  98.7%
(Missing)             172    0.6%

Value  Count  Frequency (%)
-999   177    0.6%
33.76  1      < 0.1%
34.04  1      < 0.1%
39.61  1      < 0.1%
42.13  1      < 0.1%
43.09  1      < 0.1%
44.23  1      < 0.1%
44.63  1      < 0.1%
47.78  1      < 0.1%
48.2   1      < 0.1%

Value    Count  Frequency (%)
3840.88  1      < 0.1%
3419.66  1      < 0.1%
3025.4   1      < 0.1%
3018.15  1      < 0.1%
2512.02  1      < 0.1%
2481.22  1      < 0.1%
2052.85  1      < 0.1%
2035.34  1      < 0.1%
1995.57  1      < 0.1%
1993.87  1      < 0.1%
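
The only negative value in this variable is -999, and it occurs 177 times, a pattern that often indicates a placeholder for unknown values rather than a real expense. The report itself does not say so, which makes this an assumption. Under that assumption, the sketch below converts the placeholder to a missing value before computing a mean. The function names and the toy data are illustrative.

```python
# Assumption: -999 is a placeholder code for "unknown", not a real expense.
SENTINEL = -999


def clean_expenses(values, sentinel=SENTINEL):
    """Replace the sentinel with None so it is counted as missing."""
    return [None if v == sentinel else v for v in values]


def mean_ignoring_missing(values):
    """Mean over the non-missing entries only."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


# Toy data: real-looking expenses, two sentinels and one missing entry.
raw = [333.79, -999, 366.37, None, 185.9, -999]
cleaned = clean_expenses(raw)

print(cleaned)
print(f"mean without sentinel: {mean_ignoring_missing(cleaned):.2f}")
```

If the assumption holds for the full data, the minimum, range, mean and kurtosis reported above would all change once the placeholder is treated as missing.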

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.4 KiB

False: 19214
True: 10786

Value  Count  Frequency (%)
False  19214  64.0%
True   10786  36.0%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.4 KiB

True: 20180
False: 9820

Value  Count  Frequency (%)
True   20180  67.3%
False  9820   32.7%

Dependents
Real number (ℝ≥0)

MISSING

Distinct: 10
Distinct (%): < 0.1%
Missing: 2493
Missing (%): 8.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.253026502
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 3
95-th percentile: 4
Maximum: 14
Range: 13
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9511621081
Coefficient of variation (CV): 0.4221708476
Kurtosis: 1.398534
Mean: 2.253026502
Median Absolute Deviation (MAD): 1
Skewness: 0.8047511637
Sum: 61974
Variance: 0.9047093559
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value      Count  Frequency (%)
2          13108  43.7%
3          5719   19.1%
1          5544   18.5%
4          2704   9.0%
5          372    1.2%
6          50     0.2%
7          7      < 0.1%
14         1      < 0.1%
10         1      < 0.1%
8          1      < 0.1%
(Missing)  2493   8.3%

Value  Count  Frequency (%)
1      5544   18.5%
2      13108  43.7%
3      5719   19.1%
4      2704   9.0%
5      372    1.2%
6      50     0.2%
7      7      < 0.1%
8      1      < 0.1%
10     1      < 0.1%
14     1      < 0.1%

Value  Count  Frequency (%)
14     1      < 0.1%
10     1      < 0.1%
8      1      < 0.1%
7      7      < 0.1%
6      50     0.2%
5      372    1.2%
4      2704   9.0%
3      5719   19.1%
2      13108  43.7%
1      5544   18.5%

Credit Score
Real number (ℝ≥0)

MISSING

Distinct: 17586
Distinct (%): 62.1%
Missing: 1703
Missing (%): 5.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 739.8853811
Minimum: 580
Maximum: 896.26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 580
5-th percentile: 624.372
Q1: 681.88
median: 739.82
Q3: 799.12
95-th percentile: 854.78
Maximum: 896.26
Range: 316.26
Interquartile range (IQR): 117.24

Descriptive statistics

Standard deviation: 72.1638461
Coefficient of variation (CV): 0.09753381799
Kurtosis: -0.9851944799
Mean: 739.8853811
Median Absolute Deviation (MAD): 58.67
Skewness: -0.02025516921
Sum: 20936536.63
Variance: 5207.620684
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
792.76                9      < 0.1%
811.45                8      < 0.1%
844.26                7      < 0.1%
727.72                7      < 0.1%
777.86                7      < 0.1%
760.69                6      < 0.1%
782.9                 6      < 0.1%
774.11                6      < 0.1%
735.8                 6      < 0.1%
757.77                6      < 0.1%
Other values (17576)  28229  94.1%
(Missing)             1703   5.7%

Value   Count  Frequency (%)
580     1      < 0.1%
580.85  1      < 0.1%
581.66  1      < 0.1%
582.3   1      < 0.1%
582.62  1      < 0.1%
583.05  2      < 0.1%
583.13  1      < 0.1%
583.34  1      < 0.1%
583.42  1      < 0.1%
583.54  1      < 0.1%

Value   Count  Frequency (%)
896.26  1      < 0.1%
892.18  1      < 0.1%
891.95  1      < 0.1%
891.75  1      < 0.1%
891.32  1      < 0.1%
890.02  1      < 0.1%
889.79  1      < 0.1%
889.72  1      < 0.1%
889.71  1      < 0.1%
889.24  2      < 0.1%

No. of Defaults
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

0: 24182
1: 5818

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 30000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      24182  80.6%
1      5818   19.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
024182
80.6%
15818
 
19.4%

Most occurring characters

ValueCountFrequency (%)
024182
80.6%
15818
 
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number30000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
024182
80.6%
15818
 
19.4%

Most occurring scripts

ValueCountFrequency (%)
Common30000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
024182
80.6%
15818
 
19.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII30000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
024182
80.6%
15818
 
19.4%

Has Active Credit Card
Categorical

MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 1566
Missing (%): 5.2%
Memory size: 234.5 KiB

Value | Count
Active | 9771
Inactive | 9466
Unpossessed | 9197

Length

Max length: 11
Median length: 8
Mean length: 8.283076598
Min length: 6

Characters and Unicode

Total characters: 235521
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unpossessed
2nd row: Unpossessed
3rd row: Unpossessed
4th row: Active
5th row: Inactive

Common Values

Value | Count | Frequency (%)
Active | 9771 | 32.6%
Inactive | 9466 | 31.6%
Unpossessed | 9197 | 30.7%
(Missing) | 1566 | 5.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
active | 9771 | 34.4%
inactive | 9466 | 33.3%
unpossessed | 9197 | 32.3%

Most occurring characters

Value | Count | Frequency (%)
e | 37631 | 16.0%
s | 36788 | 15.6%
c | 19237 | 8.2%
t | 19237 | 8.2%
i | 19237 | 8.2%
v | 19237 | 8.2%
n | 18663 | 7.9%
A | 9771 | 4.1%
I | 9466 | 4.0%
a | 9466 | 4.0%
Other values (4) | 36788 | 15.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 207087 | 87.9%
Uppercase Letter | 28434 | 12.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 37631 | 18.2%
s | 36788 | 17.8%
c | 19237 | 9.3%
t | 19237 | 9.3%
i | 19237 | 9.3%
v | 19237 | 9.3%
n | 18663 | 9.0%
a | 9466 | 4.6%
p | 9197 | 4.4%
o | 9197 | 4.4%

Uppercase Letter
Value | Count | Frequency (%)
A | 9771 | 34.4%
I | 9466 | 33.3%
U | 9197 | 32.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 235521 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 37631 | 16.0%
s | 36788 | 15.6%
c | 19237 | 8.2%
t | 19237 | 8.2%
i | 19237 | 8.2%
v | 19237 | 8.2%
n | 18663 | 7.9%
A | 9771 | 4.1%
I | 9466 | 4.0%
a | 9466 | 4.0%
Other values (4) | 36788 | 15.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 235521 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 37631 | 16.0%
s | 36788 | 15.6%
c | 19237 | 8.2%
t | 19237 | 8.2%
i | 19237 | 8.2%
v | 19237 | 8.2%
n | 18663 | 7.9%
A | 9771 | 4.1%
I | 9466 | 4.0%
a | 9466 | 4.0%
Other values (4) | 36788 | 15.6%

Property ID
Real number (ℝ≥0)

Distinct: 999
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 501.9347
Minimum: 1
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 52
Q1: 251
Median: 504
Q3: 751
95-th percentile: 949
Maximum: 999
Range: 998
Interquartile range (IQR): 500

Descriptive statistics

Standard deviation: 288.1580858
Coefficient of variation (CV): 0.5740947693
Kurtosis: -1.202563661
Mean: 501.9347
Median Absolute Deviation (MAD): 250
Skewness: -0.01051918972
Sum: 15058041
Variance: 83035.08241
Monotonicity: Not monotonic
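
The quantile and descriptive statistics above are linked by simple arithmetic: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the MAD is the median of the absolute deviations from the median. The Property ID figures are consistent with this (751 - 251 = 500, and 288.158 / 501.935 ≈ 0.574). The sketch below uses only the standard library and is not the report generator's own code. The helper name `describe` is hypothetical, and the use of the sample standard deviation and of linearly interpolated quartiles is an assumption about how the report computes them.

```python
import statistics


def describe(values):
    """Return a few of the report's summary statistics for a list of numbers.

    Hypothetical helper for illustration only.
    """
    data = sorted(values)
    mean = statistics.fmean(data)
    std = statistics.stdev(data)  # sample standard deviation (n - 1), an assumption
    median = statistics.median(data)
    # "inclusive" interpolates linearly between the observed minimum and maximum
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mad = statistics.median([abs(x - median) for x in data])
    return {
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "cv": std / mean,
        "mad": mad,
    }


summary = describe([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

Because the MAD and IQR depend only on order statistics, they are far less sensitive to a single extreme value than the standard deviation or CV.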
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
839 | 51 | 0.2%
944 | 46 | 0.2%
102 | 46 | 0.2%
614 | 45 | 0.1%
787 | 45 | 0.1%
870 | 45 | 0.1%
278 | 44 | 0.1%
185 | 43 | 0.1%
768 | 43 | 0.1%
382 | 43 | 0.1%
Other values (989) | 29549 | 98.5%

Value | Count | Frequency (%)
1 | 32 | 0.1%
2 | 21 | 0.1%
3 | 28 | 0.1%
4 | 34 | 0.1%
5 | 30 | 0.1%
6 | 31 | 0.1%
7 | 26 | 0.1%
8 | 33 | 0.1%
9 | 26 | 0.1%
10 | 29 | 0.1%

Value | Count | Frequency (%)
999 | 26 | 0.1%
998 | 21 | 0.1%
997 | 29 | 0.1%
996 | 27 | 0.1%
995 | 28 | 0.1%
994 | 28 | 0.1%
993 | 32 | 0.1%
992 | 35 | 0.1%
991 | 36 | 0.1%
990 | 30 | 0.1%

Property Age
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED

Distinct: 24179
Distinct (%): 96.1%
Missing: 4850
Missing (%): 16.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2631.11944
Minimum: 377.7
Maximum: 1777460.21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.5 KiB

Quantile statistics

Minimum: 377.7
5-th percentile: 1035.066
Q1: 1650.45
Median: 2223.25
Q3: 3091.4075
95-th percentile: 5067.6665
Maximum: 1777460.21
Range: 1777082.51
Interquartile range (IQR): 1440.9575

Descriptive statistics

Standard deviation: 11322.677
Coefficient of variation (CV): 4.303368684
Kurtosis: 24008.67727
Mean: 2631.11944
Median Absolute Deviation (MAD): 664.23
Skewness: 153.2196101
Sum: 66172653.91
Variance: 128203014.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1608.43 | 4 | < 0.1%
2415.7 | 4 | < 0.1%
2104.26 | 3 | < 0.1%
2412.07 | 3 | < 0.1%
2114.52 | 3 | < 0.1%
2052.94 | 3 | < 0.1%
2304.81 | 3 | < 0.1%
2013.11 | 3 | < 0.1%
1946.61 | 3 | < 0.1%
929.15 | 3 | < 0.1%
Other values (24169) | 25118 | 83.7%
(Missing) | 4850 | 16.2%

Value | Count | Frequency (%)
377.7 | 1 | < 0.1%
378.76 | 1 | < 0.1%
378.91 | 1 | < 0.1%
393.09 | 1 | < 0.1%
418.9 | 1 | < 0.1%
424.45 | 1 | < 0.1%
437.63 | 1 | < 0.1%
438.44 | 1 | < 0.1%
442.47 | 1 | < 0.1%
450.16 | 1 | < 0.1%

Value | Count | Frequency (%)
1777460.21 | 1 | < 0.1%
122966.28 | 1 | < 0.1%
54653.75 | 1 | < 0.1%
48095.16 | 1 | < 0.1%
32726.98 | 1 | < 0.1%
31866.97 | 1 | < 0.1%
31584.05 | 1 | < 0.1%
28331.74 | 1 | < 0.1%
27518.64 | 1 | < 0.1%
26049.85 | 1 | < 0.1%

Property Type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

Value | Count
1 | 7863
2 | 7650
3 | 7309
4 | 7178

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 30000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 2
3rd row: 2
4th row: 2
5th row: 4

Common Values

Value | Count | Frequency (%)
1 | 7863 | 26.2%
2 | 7650 | 25.5%
3 | 7309 | 24.4%
4 | 7178 | 23.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 7863 | 26.2%
2 | 7650 | 25.5%
3 | 7309 | 24.4%
4 | 7178 | 23.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 7863 | 26.2%
2 | 7650 | 25.5%
3 | 7309 | 24.4%
4 | 7178 | 23.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 30000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 7863 | 26.2%
2 | 7650 | 25.5%
3 | 7309 | 24.4%
4 | 7178 | 23.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 30000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 7863 | 26.2%
2 | 7650 | 25.5%
3 | 7309 | 24.4%
4 | 7178 | 23.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 7863 | 26.2%
2 | 7650 | 25.5%
3 | 7309 | 24.4%
4 | 7178 | 23.9%

Property Location
Categorical

MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 356
Missing (%): 1.2%
Memory size: 234.5 KiB

Value | Count
Semi-Urban | 10387
Rural | 10041
Urban | 9216

Length

Max length: 10
Median length: 5
Mean length: 6.751956551
Min length: 5

Characters and Unicode

Total characters: 200155
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Rural
2nd row: Rural
3rd row: Urban
4th row: Semi-Urban
5th row: Semi-Urban

Common Values

Value | Count | Frequency (%)
Semi-Urban | 10387 | 34.6%
Rural | 10041 | 33.5%
Urban | 9216 | 30.7%
(Missing) | 356 | 1.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
semi-urban | 10387 | 35.0%
rural | 10041 | 33.9%
urban | 9216 | 31.1%

Most occurring characters

Value | Count | Frequency (%)
r | 29644 | 14.8%
a | 29644 | 14.8%
U | 19603 | 9.8%
b | 19603 | 9.8%
n | 19603 | 9.8%
S | 10387 | 5.2%
e | 10387 | 5.2%
m | 10387 | 5.2%
i | 10387 | 5.2%
- | 10387 | 5.2%
Other values (3) | 30123 | 15.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 149737 | 74.8%
Uppercase Letter | 40031 | 20.0%
Dash Punctuation | 10387 | 5.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 29644 | 19.8%
a | 29644 | 19.8%
b | 19603 | 13.1%
n | 19603 | 13.1%
e | 10387 | 6.9%
m | 10387 | 6.9%
i | 10387 | 6.9%
u | 10041 | 6.7%
l | 10041 | 6.7%

Uppercase Letter
Value | Count | Frequency (%)
U | 19603 | 49.0%
S | 10387 | 25.9%
R | 10041 | 25.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 10387 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 189768 | 94.8%
Common | 10387 | 5.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 29644 | 15.6%
a | 29644 | 15.6%
U | 19603 | 10.3%
b | 19603 | 10.3%
n | 19603 | 10.3%
S | 10387 | 5.5%
e | 10387 | 5.5%
m | 10387 | 5.5%
i | 10387 | 5.5%
R | 10041 | 5.3%
Other values (2) | 20082 | 10.6%

Common
Value | Count | Frequency (%)
- | 10387 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 200155 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 29644 | 14.8%
a | 29644 | 14.8%
U | 19603 | 9.8%
b | 19603 | 9.8%
n | 19603 | 9.8%
S | 10387 | 5.2%
e | 10387 | 5.2%
m | 10387 | 5.2%
i | 10387 | 5.2%
- | 10387 | 5.2%
Other values (3) | 30123 | 15.0%

Co-Applicant
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.5 KiB

Value | Count
1 | 25516
0 | 4316
-999 | 168

Length

Max length: 4
Median length: 1
Mean length: 1.0168
Min length: 1

Characters and Unicode

Total characters: 30504
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 25516 | 85.1%
0 | 4316 | 14.4%
-999 | 168 | 0.6%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 25516 | 85.1%
0 | 4316 | 14.4%
999 | 168 | 0.6%

Most occurring characters

Value | Count | Frequency (%)
1 | 25516 | 83.6%
0 | 4316 | 14.1%
9 | 504 | 1.7%
- | 168 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 30336 | 99.4%
Dash Punctuation | 168 | 0.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 25516 | 84.1%
0 | 4316 | 14.2%
9 | 504 | 1.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 168 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 30504 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 25516 | 83.6%
0 | 4316 | 14.1%
9 | 504 | 1.7%
- | 168 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30504 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 25516 | 83.6%
0 | 4316 | 14.1%
9 | 504 | 1.7%
- | 168 | 0.6%

Property Price
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 29632
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 131759.6803
Minimum: -999
Maximum: 1077966.73
Zeros: 0
Zeros (%): 0.0%
Negative: 352
Negative (%): 1.2%
Memory size: 234.5 KiB

Quantile statistics

Minimum: -999
5-th percentile: 27945.385
Q1: 60572.16
Median: 109993.61
Q3: 178880.72
95-th percentile: 315010.661
Maximum: 1077966.73
Range: 1078965.73
Interquartile range (IQR): 118308.56

Descriptive statistics

Standard deviation: 93549.5481
Coefficient of variation (CV): 0.7100013291
Kurtosis: 3.149422456
Mean: 131759.6803
Median Absolute Deviation (MAD): 54675.335
Skewness: 1.41696463
Sum: 3952790408
Variance: 8751517951
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
-999 | 352 | 1.2%
85907.5 | 2 | < 0.1%
51395.67 | 2 | < 0.1%
356356.48 | 2 | < 0.1%
335541.91 | 2 | < 0.1%
51652.52 | 2 | < 0.1%
77957.35 | 2 | < 0.1%
28252.49 | 2 | < 0.1%
57914.08 | 2 | < 0.1%
53075.16 | 2 | < 0.1%
Other values (29622) | 29630 | 98.8%

Value | Count | Frequency (%)
-999 | 352 | 1.2%
7265.95 | 1 | < 0.1%
7309.87 | 1 | < 0.1%
7439.12 | 1 | < 0.1%
7859.62 | 1 | < 0.1%
7900.34 | 1 | < 0.1%
8012.24 | 1 | < 0.1%
8029.94 | 1 | < 0.1%
8068.74 | 1 | < 0.1%
8231.39 | 1 | < 0.1%

Value | Count | Frequency (%)
1077966.73 | 1 | < 0.1%
1028082.64 | 1 | < 0.1%
987770.87 | 1 | < 0.1%
769438.43 | 1 | < 0.1%
736792.32 | 1 | < 0.1%
683713.79 | 1 | < 0.1%
680125.07 | 1 | < 0.1%
678932.62 | 1 | < 0.1%
674193.1 | 1 | < 0.1%
667386.91 | 1 | < 0.1%

Loan Sanction Amount (USD)
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 21450
Distinct (%): 72.3%
Missing: 340
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 47649.34221
Minimum: -999
Maximum: 481907.32
Zeros: 7865
Zeros (%): 26.2%
Negative: 338
Negative (%): 1.1%
Memory size: 234.5 KiB

Quantile statistics

Minimum: -999
5-th percentile: 0
Q1: 0
Median: 35209.395
Q3: 74261.25
95-th percentile: 141339.9685
Maximum: 481907.32
Range: 482906.32
Interquartile range (IQR): 74261.25

Descriptive statistics

Standard deviation: 48221.14669
Coefficient of variation (CV): 1.01200026
Kurtosis: 1.760122431
Mean: 47649.34221
Median Absolute Deviation (MAD): 35209.395
Skewness: 1.229939031
Sum: 1413279490
Variance: 2325278988
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 7865 | 26.2%
-999 | 338 | 1.1%
28937.45 | 3 | < 0.1%
27394.93 | 2 | < 0.1%
26047.92 | 2 | < 0.1%
27767.18 | 2 | < 0.1%
15270.22 | 2 | < 0.1%
54850.85 | 2 | < 0.1%
105047.72 | 2 | < 0.1%
47136.42 | 2 | < 0.1%
Other values (21440) | 21440 | 71.5%
(Missing) | 340 | 1.1%

Value | Count | Frequency (%)
-999 | 338 | 1.1%
0 | 7865 | 26.2%
4023.18 | 1 | < 0.1%
4121.66 | 1 | < 0.1%
4183.49 | 1 | < 0.1%
4319.61 | 1 | < 0.1%
4417.18 | 1 | < 0.1%
4445.54 | 1 | < 0.1%
4482.97 | 1 | < 0.1%
4488.02 | 1 | < 0.1%

Value | Count | Frequency (%)
481907.32 | 1 | < 0.1%
395368.74 | 1 | < 0.1%
326730.56 | 1 | < 0.1%
323233.31 | 1 | < 0.1%
313502.39 | 1 | < 0.1%
313452.68 | 1 | < 0.1%
301457.2 | 1 | < 0.1%
293150.14 | 1 | < 0.1%
292666.61 | 1 | < 0.1%
289506.99 | 1 | < 0.1%

Interactions

[Pairwise interaction scatter plots for the numeric variables; figures not reproduced in this text.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so the slope of a linear relationship, that is, its angle to the x-axis, affects only the sign of r and not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
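
The calculation just described can be sketched with the standard library alone. This is an illustrative implementation, not the code the report generator runs; the function name `pearson_r` and the toy inputs are assumptions. The 1/(n - 1) factors in the covariance and in the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares; the common 1/(n - 1) factor cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(pearson_r(xs, ys))
# Shifting and rescaling one variable leaves r unchanged (up to rounding).
print(pearson_r(xs, [10 * y + 3 for y in ys]))
```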

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
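
A minimal sketch of this rank-based calculation follows, assuming that tied values receive the average of the ranks they span. The helper names `average_ranks` and `spearman_rho` are hypothetical and the data are toy values.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# A monotonic but nonlinear relationship: the ranks agree perfectly.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```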

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
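
The pair-counting definition above corresponds to the variant usually called tau-a, and it can be sketched directly. This is an assumption-laden illustration: the function name `kendall_tau_a` is hypothetical, tied pairs are counted as neither concordant nor discordant, and library implementations often apply a tie correction (tau-b) that this sketch omits.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables move in the same direction
            concordant += 1
        elif product < 0:    # the variables move in opposite directions
            discordant += 1
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Five of the six pairs are concordant and one is discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```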

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
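
As an illustration of the bias-corrected measure, the sketch below computes the chi-squared statistic from a contingency table of two label sequences and then applies the commonly cited form of Bergsma's correction. It is a standard-library sketch under stated assumptions, not the report generator's code; the function name `cramers_v_corrected` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over all cells, including empty ones.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single level carries no association
    return math.sqrt(phi2_corr / denom)


perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```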

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
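
The nullity correlation shown in the heatmap can be read as the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The sketch below works under that assumption. It uses `None` for missing cells and a hypothetical helper name `nullity_correlation`, and the toy columns are illustrative rather than taken from the dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    if n != len(b) or n == 0:
        raise ValueError("columns must be non-empty and of equal length")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return math.nan  # a fully present or fully missing column has no nullity variance
    return cov / math.sqrt(var_a * var_b)


# Toy columns: the first two are missing in the same rows, the third in the others.
col_1 = [10.0, None, 30.0, None, 50.0]
col_2 = [1.5, None, 2.5, None, 3.5]
col_3 = [None, 7.0, None, 9.0, None]
print(nullity_correlation(col_1, col_2))  # close to +1: missing together
print(nullity_correlation(col_1, col_3))  # close to -1: one present when the other is missing
```

A value near +1 means the two columns tend to be missing in the same rows, while a value near -1 means one tends to be present exactly when the other is missing.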

Sample

First rows

Row | Customer ID | Name | Gender | Age | Income (USD) | Income Stability | Profession | Type of Employment | Location | Loan Amount Request (USD) | Current Loan Expenses (USD) | Expense Type 1 | Expense Type 2 | Dependents | Credit Score | No. of Defaults | Has Active Credit Card | Property ID | Property Age | Property Type | Property Location | Co-Applicant | Property Price | Loan Sanction Amount (USD)
0 | C-36995 | Frederica Shealy | F | 56 | 1933.05 | Low | Working | Sales staff | Semi-Urban | 72809.58 | 241.08 | N | N | 3.0 | 809.44 | 0 | NaN | 746 | 1933.05 | 4 | Rural | 1 | 119933.46 | 54607.18
1 | C-33999 | America Calderone | M | 32 | 4952.91 | Low | Working | NaN | Semi-Urban | 46837.47 | 495.81 | N | Y | 1.0 | 780.40 | 0 | Unpossessed | 608 | 4952.91 | 2 | Rural | 1 | 54791.00 | 37469.98
2 | C-3770 | Rosetta Verne | F | 65 | 988.19 | High | Pensioner | NaN | Semi-Urban | 45593.04 | 171.95 | N | Y | 1.0 | 833.15 | 0 | Unpossessed | 546 | 988.19 | 2 | Urban | 0 | 72440.58 | 36474.43
3 | C-26480 | Zoe Chitty | F | 65 | NaN | High | Pensioner | NaN | Rural | 80057.92 | 298.54 | N | Y | 2.0 | 832.70 | 1 | Unpossessed | 890 | NaN | 2 | Semi-Urban | 1 | 121441.51 | 56040.54
4 | C-23459 | Afton Venema | F | 31 | 2614.77 | Low | Working | High skill tech staff | Semi-Urban | 113858.89 | 491.41 | N | Y | NaN | 745.55 | 1 | Active | 715 | 2614.77 | 4 | Semi-Urban | 1 | 208567.91 | 74008.28
5 | C-17688 | Polly Crumpler | F | 60 | 1234.92 | Low | State servant | Secretaries | Rural | 34434.72 | 181.48 | N | N | 2.0 | 684.12 | 1 | Inactive | 491 | 1234.92 | 2 | Rural | 1 | 43146.82 | 22382.57
6 | C-23855 | Nathalie Olivier | M | 43 | 2361.56 | Low | Working | Laborers | Semi-Urban | 152561.34 | 697.67 | Y | Y | 2.0 | 637.29 | 0 | Unpossessed | 227 | 2361.56 | 1 | Semi-Urban | 1 | 221050.80 | 0.00
7 | C-11006 | Clarinda Montana | F | 45 | NaN | Low | State servant | Managers | Semi-Urban | 240311.77 | 807.64 | N | N | 2.0 | 812.26 | 0 | Active | 314 | NaN | 2 | Urban | 1 | 401040.70 | 168218.24
8 | C-26934 | Kenny Ankrom | F | 38 | 1296.07 | Low | Working | Cooking staff | Rural | 35141.99 | 155.95 | N | Y | 3.0 | 705.29 | 1 | Active | 241 | 1296.07 | 4 | Rural | 1 | 54903.44 | 22842.29
9 | C-24944 | Barbie Goetsch | M | 18 | 1546.17 | Low | Working | Laborers | Rural | 42091.29 | 500.20 | N | N | 2.0 | 613.24 | 0 | Unpossessed | 883 | 1546.17 | 2 | Urban | 1 | 67993.43 | 0.00

Last rows

Row | Customer ID | Name | Gender | Age | Income (USD) | Income Stability | Profession | Type of Employment | Location | Loan Amount Request (USD) | Current Loan Expenses (USD) | Expense Type 1 | Expense Type 2 | Dependents | Credit Score | No. of Defaults | Has Active Credit Card | Property ID | Property Age | Property Type | Property Location | Co-Applicant | Property Price | Loan Sanction Amount (USD)
29990 | C-49100 | Chantel Costigan | F | 64 | 4211.81 | Low | Commercial associate | Cooking staff | Urban | 225434.35 | 867.88 | N | N | 3.0 | 842.81 | 0 | Unpossessed | 408 | 4211.81 | 1 | Urban | 1 | 285966.63 | 0.00
29991 | C-20609 | Barbie Hice | F | 55 | NaN | Low | Commercial associate | High skill tech staff | Semi-Urban | 127571.87 | 707.74 | N | Y | 2.0 | NaN | 0 | Unpossessed | 213 | NaN | 4 | Urban | 1 | 195998.69 | 95678.90
29992 | C-32912 | Nana Nell | M | 34 | 2904.15 | Low | Working | Laborers | Semi-Urban | 141260.03 | 477.19 | N | N | 4.0 | 647.87 | 0 | Inactive | 922 | 2904.15 | 2 | Semi-Urban | 1 | 226298.50 | 0.00
29993 | C-21440 | Jonathon Rodriques | M | 62 | NaN | Low | Working | Cleaning staff | Semi-Urban | 9811.65 | 107.88 | N | N | 2.0 | 709.43 | 1 | Active | 404 | NaN | 2 | Urban | 1 | 17956.31 | 6377.57
29994 | C-7813 | Jocelyn Deschamp | M | 39 | 2250.19 | Low | Commercial associate | Managers | Rural | 83810.38 | 430.66 | Y | Y | 3.0 | NaN | 0 | Inactive | 265 | 2250.19 | 3 | Urban | 1 | 129028.33 | 62857.78
29995 | C-43723 | Angelyn Clevenger | M | 38 | 4969.41 | Low | Commercial associate | Managers | Urban | 76657.90 | 722.34 | Y | Y | 2.0 | 869.61 | 0 | Unpossessed | 566 | 4969.41 | 4 | Urban | 1 | 111096.56 | 68992.11
29996 | C-32511 | Silas Slaugh | M | 20 | 1606.88 | Low | Working | Laborers | Semi-Urban | 66595.14 | 253.04 | N | N | 3.0 | 729.41 | 0 | Inactive | 175 | 1606.88 | 3 | Urban | 1 | 73453.94 | 46616.60
29997 | C-5192 | Carmelo Lone | F | 49 | NaN | Low | Working | Sales staff | Urban | 81410.08 | 583.11 | N | Y | NaN | NaN | 0 | Active | 959 | NaN | 1 | Rural | 1 | 102108.02 | 61057.56
29998 | C-12172 | Carolann Osby | M | 38 | 2417.71 | Low | Working | Security staff | Semi-Urban | 142524.10 | 378.29 | N | Y | 3.0 | 677.27 | 1 | Unpossessed | 375 | 2417.71 | 4 | Urban | 1 | 168194.47 | 99766.87
29999 | C-33003 | Bridget Garibaldi | F | 63 | 3068.24 | High | Pensioner | NaN | Rural | 156290.54 | 693.94 | N | Y | 1.0 | 815.44 | 0 | Active | 344 | 3068.24 | 3 | Rural | 1 | 194512.60 | 117217.90